perm filename MATH.THY[DIS,DBL] blob sn#212610 filedate 1976-04-23 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00007 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	Mathematics Research: a model research/textbook orders an example.
C00005 00003	. SSSEC(Model of Math Research)
C00016 00004	↓_Textbook-Order vs Research-Order_↓
C00024 00005	Let's make this more concrete, by considering how a mathematician might discover
C00036 00006	↓_Metaphors to other processes_↓
C00044 00007	. SSSEC(AM as Heuristic Search)
C00063 ENDMK
C⊗;
Mathematics Research: a model; research/textbook orders; an example.

Section 1 bares the underlying basis of AM: all the previously-hidden
assumptions about what constitutes "doing math research". 
First, let me
"admit" the existence of such a basis of beliefs, an interlocking set of
constraints which are (i) believed by the author, (ii) quietly interwoven 
throughout AM, and yet (iii) can't be readily tested by AM.
Much of this foundation deals with what to ⊗4ignore⊗*: incubation, subconscious,
human drives and taboos, political climate,  and factors which I'm not even aware
of ignoring.
The rest of the basis is more positive. In this section, a detailed model of
-- perhaps a recipe for -- making math discoveries will be expounded. The reader
will see that AM encorporates many of the details, in all-encompassing ways which
can't be separated from the rest of the system and examined.
For example, there is the postulate that heuristics can be useful for both
suggesting new things to consider, and for keeping that space of plausible items
pruned to a reasonable size. By the very manner that heruistic rules are
coded and used, these two functions are blended together in AM.

$$ A mathematical
theory encompasses (i) a basis of primitive objects and activities, (ii) a
foundation of axiomatic constraints on the basis, (iii) a body of definitions
tying together basic and already-defined concepts, (iv) a body of theorems which
are implied by the foundation and the definitions, (v) some interpretations in
terms of either reality or  other theories.$
. SSSEC(Model of Math Research)

In order to intelligently discuss how to automate an activity, we must be very
clear about how humans do that activity. Thus, for AM, we must hypothesize
a particular model of how mathematicians actually go about doing their research.
Thanks to Polya, Kersher, Hadamard, Skemp, many others, and introspection,
I have pieced together a tentative such information processing
model for math theory formation:

.ONCE SELECT 8 TURN OFF "α" TURN ON "∞→"

⊂∞α→⊃

.BN NARROW 2,2  SINGLE SPACE PREFACE 0

1. The order of events in a typical mathematical investigation is:
(a) OBSERVE:
The observation is either of reality or of an analogous, already-established
mathematical theory.
(b) NOTICE REGULARITY: Perceive some patterns, some interesting relationships
that appear to hold sometimes.
Thus math research is an ⊗4empirical⊗* process.
(c) FORMALIZE: Decide on some of the objects, operators,
definitions, and statements that the theory will contain.
(d) FINALIZE: Decide which concepts are to be primitve and which aren't;
decide which statements will be considered axioms, and ensure that the others
can in fact be derived from them.
(e) DEVELOP: What additional theorems can be proven from this formal system;
do they correspond to observable phenomena in the domain which motivated
this new theory?  When new observations are made in that motivating domain,
can they be naturally phrased as formal statements in this theory; and if so,
are they provable from the existing axioms and theorems?


2. 
Notice that each step in (1) involves choosing from a large set of alternatives
-- that is, searching.
The key to keeping this from becoming a blind, explosive search is the
proper use of evaluation criteria. That is, one must constantly choose the
most interesting, aesthetically pleasing, useful,... alternative available.
This is analogous to Dendral's reliance on good heuristics to constrain the
structure generator.

3. 
But many of those criteria are usually opposed to each other 
(e.g., one often must sacrifice
elegance for utility, interestingness for safety, etc.). How should one
weight these features when deciding what to do next  during an investigation?
We believe (and one goal of AM is to test) that
the non-formal criteria (aesthetics, interestingness, inductive inference (from
empirical evidence), analogy, intuitive clarity, utility)
are much more important than formal
deductive methods in developing mathematically worthwhile theories, and
in avoiding barren diversions.
Among the subjective criteria, the  order listed above is roughly their order
of importance. However, AM should have a dynamically variable "orientation",
which under certain circumstances might induce it to seek safety (e.g., utility)
rather than uncertainty (e.g., proposing an analogous proposition).

4. The above criteria are virtually the same in all domains of mathematics,
and at all levels. Each factor encourages some pursuits and discourages others.
It is hoped that no modifications need be made to AM's judgmental scheme,
as AM acquires more and more new concepts.

5. For true understanding, AM should grasp$$  Have access to, relate to, store,
be able to manipulate, be able to answer questions about $
each concept in several ways:
declarative (definition), operational (how to use it), demonic (recognizing
when it is relevant), exemplary (especially boundary examples), and
intuitive (simulated image of a real-world interpretation).

6. Progress in ⊗4any⊗* field of mathematics demands much informal heuristic
knowledge (and some
formal knowledge) of ⊗4many⊗* different "nearby" 
mathematical fields. So a broad,
universal core of intuition must be established before any single theory
can meaningfully be developed.
Intuition is contrasted with more formal representations by the fact that it
is opaque (AM cannot introspect to determine how the result is produced) and
fallable.

.WIDEN 2,2

.ONCE SELECT 8 TURN OFF "α" TURN ON "∞→"

%∞α→$

.END

.SKIP 2

This last point (6) merits elaboration. I believe that to develop any given field
of mathematics, one must possess much intuition and heuristics drawn from each
⊗4psychologically⊗* prerequisite field,$$ One which makes the new field
easier to learn, which contains more concrete analogues of the ideas of the new
field. For example, knowing about geometry makes it easier to learn about
topology, even though topology never formally uses any results or definitions
from geometry. $ and some definite facts  about each ⊗4formally⊗* preceding 
field$$ For example, arithmetic is usually formally based upon set theory,
although most of us learn about them in the other order $  of mathematics.  
The diagram here
indicates the prerequisites for each domain which might
conceivably be worked in by AM.   To start in a given domain, some knowledge
should exist about all domains which point to the given one,
about all domains which point to ⊗4them⊗*, etc.


⊗4NOTE: This section, and especially the one after it, are
supplementary, and may be skipped on first reading.⊗*

.B0 TURN OFF "α"


Elementary Logic  ααα→  Theorem-Proving  ααααααααααααααα⊃
    ↑							~
    ~							~
    ~							~
    εαααααααα→  Geometry  ααα→  Topology		~
    ~		    ~		↑      ~		~
    ~		    ~		~      ~		~
    ~		    ↓	        ~      ↓ 		~
    ~      Analytic Geometry    ~   Algebraic Topology	~
    ~		 ↑              ↓      ↑		~
    ~            ~ Measure Theory      ~		~
    ↓	         ~ ↑ 		       ~		~
Boolean Algebra αβαβαα→  Abstract Algebra 		~
    ↑            ~ ~      ~				↓
.ONCE TURN ON "α"
    ~	         ~ ↓      ~		  Program Verification
    ~	    Analysis      ↓				↑
    ~		   ↑     Concrete Algebra		~
    ~              ~      ↑				~
    ~		   ~      ~				~
    ↓		   ~      ~				~
Set Theory  ααα→  Arithmetic  ααα→  Number Theory	~
		      ~					~
		      ~					~
		      ↓					~
		Combinatorics  ←ααα→  Graph Theory  αααα$
.E


Each arrow represents either a psychological or a formal prerequisite:
To work in the field pointed ↓_to_↓, one should know something about the field
pointed ↓_from_↓.
Notice that almost all the "elementary" 
branches of mathematics are either formally
or psychologically prerequisite 
to each other.$$ For example, one should have some intuition about 
sets before doing Number theory,
and one should have some grasp of numbers 
and counting before doing Set theory. $ So a broad foundation of intuition, 
spanning ⊗4↓_several_↓⊗* mathematical and 
real-world concepts, is prerequisite to sophisticated behavior in ⊗4any⊗*
branch of mathematics.
This smacks of  the idea of a "critical mass" of knowledge,
so often sensationalized in science fiction. In addition to 
expecting that the corpus must exceed some minimum absolute size,
I am claiming that it must also exceed some minimum degree of richness, of breadth.

↓_Textbook-Order vs Research-Order_↓

Let us elaborate on point (1) of the model. The idea there is that
the order in which a math textbook presents a theory is almost the exact
opposite of the order in which it was actually discovered and developed.

.B0  INDENT 10 TURN OFF "α"

PRIMITIVES, AXIOMS ααααα→ DEFINITIONS    ←αααααααααααααα⊃
				~			~
				~			~
				↓			~
			STATEMENT OF LEMMA 		~
				~			~
				~			~
				↓			~
			  PROOF OF LEMMA		~
				~			~
				~			~
				↓			~
				~			~
				~			~
				↓			~
		     STATEMENT OF NEXT LEMMA		~
⊂αααααααα⊃ 			~			~
~textbook~      		|			~
~ order  ~			|			~
%αααααααα$			|			~
				|			~
				|			~
				↓			~
		       PROOF OF LAST LEMMA		~
				~			~
				~			~
				↓			~
		      STATEMENT OF THEOREM		~
				~			~
				~			~
				↓			~
			PROOF OF THEOREM ααααααααααααααα$

.E

In a textbook, one introduces some primitive elements of the theory, postulates
some axioms relating these objects and operations, then enters a 
⊗4"Define → State → Prove⊗*" loop. Lemmas are stated and proved before the
theorems which require them. Motivations of any kind are a rareity, and often
are mentioned parenthetically as an application of a theorem which has just been
proved.  

As an example, here is how my sophomore textbook (Introduction to
Abstract Algebra) was organized:

.B0
.INDENT 10
.TURN ON "α"

Set N, Op ⊗1Successor:N→N,⊗*
Peano⊗1'⊗*s Axioms   αααααααααααα→   Define +                      
				~			
				↓			
		    State some property of +   		
				~			
				↓			
		      Prove this propterty		
				|			
				|		
				|			
		           Define ⊗6≤⊗*
				|			
				|			
				|			
		           Define g.l.b
				~			
				~			
				↓			
		   State that glb always exists	
				~			
				↓			
			Prove this claim
				|
				|
				|
			  Define  TIMES
			  Define ⊗5Z⊗*
				|
				|
				|
			  Define  DIVIDES
			  Define  PRIME
				~
				↓
		     State Euclid⊗1'⊗*s Algorithm
				~
				↓
		     Prove Euclid⊗1'⊗*s Algorithm
.E

It began by presenting Peano's axioms, completely unmotivated, then entered
the the Define → State → Prove loop.  For example, the two concepts of
⊗4prime⊗* and ⊗4non-factorizable number⊗* are defined separately,
even though a hundred pages elapse before any mathematical system is presented
in which the two notions don't completely coincide.

This psychic ordering is repeated, 
in microcosm, in each textbook proof. For example,
a typical epsilon/delta argument will start by magically proposing some function
delta(epsilon), which is just the right one for the approximations taken later.

In contrast, 
a mathematician doing research works in almost the opposite order.

.B0 INDENT 4 TURN OFF "α"

OBSERVE x  αααα→  NOTICE SOME INTERESTING REGULARITIES  
				 ~
				 ~
				 ↓
	     GATHER UP ALL RELATED INTERESTING RELATIONSHIPS
				 ~
				 ~					⊂αααααααα⊃
				 ↓					~research~
		        STATE THESE FORMALLY				~ order  ~
				 ~					%αααααααα$
				 ~
				 ↓
		CHOOSE THE MOST INTUITIVELY OBVIOUS OF THESE
				 ~
				 ~
				 ↓
		 TRY TO DERIVE ALL THE REST FROM THIS CORE
				 ~
				 ~
				 ↓
       IF THE CORE IS INCONSISTENT, ELIMINATE THE LEAST BELIEVED MEMBERS
       IF ANY CAN⊗6'⊗*T BE DERIVED: ENLARGE THE CORE SO THAT THEY CAN BE
       IF ANY MEMBER OF THE CORE CAN BE DERIVED FROM THE OTHERS, ELIMINATE IT
				 ~
				 ~
				 ↓
	   CALL THE CORE "AXIOMS", AND THE REST "THEOREMS"
				 ~
				 ~
				 ↓
       TRY TO DERIVE PREVIOUSLY UNSUSPECTED THEOREMS, FORMALLY
		IF SO: TRY TO REINTERPRET IN TERMS OF x
       TRY TO FIND NEW RELATIONSHIPS OCCURRING IN x
		IF SO: TRY TO PROVE THEM USING THE EXISTING AXIOMS AND THEOREMS
.E

The researcher begins by observing the world: he looks at scenarios from reality
and perhaps also some earlier, already-developed, interesting mathematical theories.
He notices some interesting facts, either by raw perception of regularities
or by analogy to some interesting pattern in another theory. Of these observations,
he chooses a small set of the most intuitively clear ones, and tries to derive the
other observations formally from this chosen core. After several passes, he will
succeed; the final chosen core he calls axioms, and the rest he calls theorems.$$
Once axiomatized, new theorems may be proposed syntactically: the researcher then
may check each conjecture against the roots of the theory (both real-world and
analogous). Additional observations can later be checked against the theory,
to see if they also can be derived from the axioms. $

Let's make this more concrete, by considering how a mathematician might discover
and develop the simplest numerical concepts, the ones my textbook introduced
and developed in such a polished way.

Assume that he possesses our previously-mentioned prenumerical background, but
nothing more.
He begins by considering the two activities:
Comparing piles of marbles using a pan balance, and comparing the heights of
towers constructed from toy blocks.

.SKIP TO COLUMN 1

.B0 INDENT 7 TURN OFF "α"

	       /~\				       /~\
	      /	~ \				      /	~ \
	     /	~  \				     /	~  \
	    /	~   \				    /	~   \
	   /    ~    \				   /    ~    \
	  /	~     \				  /	~     \
      	 /	~      \    		     oo	 /	~      \  oo
    αααα/	~	\αααα		    αααα/	~	\αααα
		~					~
		~					~
		~					~
    αααααααααααα∀αααααααααααα		    αααααααααααα∀αααααααααααα

			       /~\
			      /	~ \
			     /	~  \
		 	    /	~   \   o
			   /    ~    \αααα
			  /	~     			⊂αααααααα⊃
		      	 /	~      			~weighing~
		        /	~			~ marbles~
		       /	~			%αααααααα$
		 ooo  /		~
		 αααα/		~
				~
		    ααααααααααααααααααααααααα

    ≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡

	  αααααα⊃
	        |
	⊂ααααα⊃ |
	~     ~ |
	~     ~ |					⊂αααααα⊃
	εαααααλ |					~piling~
	~     ~ |		  αααααα⊃		~blocks~
	~     ~ |		  	|		%αααααα$
	εαααααλ |		⊂ααααα⊃ |
	~     ~ |		~     ~ |
	~     ~  		~     ~ 
	%ααααα$     		%ααααα$
.E

.SKIP TO COLUMN 1

When he tires of playing, the mathematician makes a list of the most basic
features of these situations. 


.BEGIN NOFILL SELECT 1 PREFACE 0 INDENT 0 TABS 42,82 TURN ON "\" GROUP

.TURN ON "→∞"

.ONCE CENTER PREFACE 2

⊗5FUNDAMENTAL CONCEPTS⊗*

↓_ABOUT WEIGHING MARBLES:_↓\↓_ABOUT PILING BLOCKS:_↓ →↓_COMMON NAME:_↓

There are some marbles to play with.\There are some blocks to play with.→element

The marbles are indistinguishable.\The blocks are indistinguishable.→set

Besides individual marbles, we can\Besides individual blocks, we can→variable
talk about a pile of marbles as a unit.\talk about a tower of blocks as unit.

We can copy a given pile.\We can copy a given tower.→copy

A pile of marbes and a copy of\A tower of blocks and a copy of
that pile are indistinguishable.\that tower are indistinguishable.→ignore

To make pile heavier, add a marble.\We can stack another block onto a tower.→successor

Lighten pile by removing a marble.\Can shorten tower by removing top block.→predecessor

We can tell if piles x,y balance.\Can see if towers x,y are same size.→equality

Can tell if pile x is heavier than y.\We can see if tower x is higher than y.→⊗6>⊗*

We can add one pile to another.\We can stack one tower on top of another.→add

Can remove a pile from a big one.\Can lift a tower off top of a big one.→subtract

Can remove any marble from pile.\We can only remove topmost blocks.→delete/pop

Pile might have no marbles.\There might be no blocks in a tower.→empty set

Can replace each marble in a pile\We can replace each block in a tower
by a copy of some other pile.\by a copy of some other tower.→multiply

A lone marble is also a pile.\A lone block is also a tower.→x↔{x}

etc.\etc.→etc.

⊗8∞≡→⊗*

.E

These correspond to the objects and operators of the theory he is about to develop.
He gives them each a name; I have permitted him to choose the common name for
each concept$$
For brevity, he will call the
empty set 0, and a singleton set 1. + will signify addition, etc. 
S(n) is the successor of n (that is, n+1), and D(n) is the predecessor of n
(decrement n; that is, n-1).
$,
but that doesn't matter, of course. Notice that most of these
are not usually considered primitive at all. 

The next list he draws up contains several specific observations, translated from
the blocks/marbles manipulation language into these new symbols.
Each relationship is translated from these symbols into the
⊗4other⊗* scenario's language, and checked. Thus all the observations here are
meaningful in both of the manipulative domains:

.COMMENT BEGIN NOFILL SELECT 6 PREFACE 0 INDENT 0 TABS 15,30,45,60,75 TURN ON "\" GROUP;
.WBOX(0,0) TABS 15,30,45,60,75 TURN ON "\" GROUP
.COMMENT ⊗8≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡⊗*;
MBOX $
MBOX 0=0\1=1\¬(0=1)\¬(1=0)\¬(S(1)=1)\¬(0=S(0)) $
MBOX $
MBOX S(0)=1\1=S(0)\S(0)=S(0)\S(1)=S(1)\¬(S(0)=S(1))\¬(S(1)=S(0)) $
MBOX $
MBOX 0+0=0\1+1=S(1)\0+1=1\1+0=1\S(1)+0=S(1)\S(0)+S(0)=S(1) $
MBOX $
MBOX 0⊗7x⊗*0=0\1⊗7x⊗*1=1\0⊗7x⊗*1=0\1⊗7x⊗*0=0\S(S(1))⊗7x⊗*0=0\S(S(1)⊗7x⊗*1=S(S(1)) $
MBOX $
MBOX 1>0\S(1)>1\S(1)>0\¬(0>1)\¬(1>1)\¬(0>0) $
.EBOX
.COMMENT ⊗8≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡⊗*;
.COMMENT E
.COMMENT if the box looks nicer than just ≡≡≡, then use it allover this section. ;


Now comes some simple generalization, either directly from the scenarios,
perceptually from the list just compiled, or syntactically from that list
(e.g., by replacing constants by variables). 
Again,  each of these generalizations
is intuitively checked in both real-world domains.

.BEGIN NOFILL SELECT 6 PREFACE 0 INDENT 0 TABS 15,30,45,60,75 TURN ON "\" GROUP

⊗8≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡⊗*

If n>0, then  D(n) can be performed; else D(n) cannot be performed.

If n>0, then n>D(n)\S(n)>n\\S(n)=n+1\S(S(n))>n

If n=m then S(n)=S(m)\\\If S(n)=S(m) then n=m

D(S(n))=n\\If n>0 then S(D(n))=n\If n>0 then n+m>m

m+S(n)>m\\If n=m then ¬(n>m)\If n>m then ¬(n=m)

If n=m ∧ n>r then m>r\n=n\¬(n>n)\¬(n=S(n))

If n>m ∧ m>r then n>r\If n=m ∧ m=r then n=r\If n=m ∧ r=m then r=n

If n=m then m=n\If n>m then ¬(m>n)\n>m ∨ m>n ∨ n=m

n⊗7x⊗*m = m⊗7x⊗*n\\n⊗7x⊗*S(m)=n⊗7x⊗*m + n\\n+m=m+n

n+S(m)=S(n)+m\\(n+m)+(r+s) = n+((s+m)+r)

n⊗7x⊗*(m⊗7x⊗*r)=(n⊗7x⊗*r)⊗7x⊗*m\S(n)+D(m)=m+n

etc.

⊗8≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡⊗*

.E

Some of these are now considered as axiomatic, as defining the operations and
structures involved; the rest are considered secondary, as theorems following from
the axioms. This process is a slow search.
The most important game is not "finding minimal axiom sets", but rather
"what else can we prove from this set of axioms and theorems".
The AM research goal is to learn how to play this game of theory development.

To summarize idea (1):
The ⊗4motivation⊗* for a new mathematical development is either:
(a) looking at reality and at the mathematics
you have already$$ Then abstract out some interesting features; 
formalize those into a
specific set of primitive structures and relations  and axioms about those
basic entities $, or (b) given an existing theory, propose some new conjecture or
definition, based on analogy to the real-world roots of this theory,$$ Based
on patterns perceived in empirical
evidence $  or based on analogy to a known interesting theorem in another theory,
or occasionally based only on applying some fruitful method and observing the
result.

↓_Metaphors to other processes_↓

Let's consider some alternate ways of looking at such theory development, some
analogies to processes with which you're probably more familiar.

.BN

λλ Growing a tree -- using heuristic constraints:
Once motivated, the idea is developed and evaluated. At each moment, the
researcher should be working on the most promising of all the proposed ideas.
This process is thus a sophisticated expansion of a tree,
where new conceptual nodes are
"grown" in the most promising area, and where barren dead-ends are eventually
"pruned". To do mathematical research well, it is thus necessary and sufficent
to have good methods for proposing new concepts from existing ones, and for
deciding how interesting each candidate is. That is: effective growing and pruning
strategies.

λλ Exploring a space using an evaluation function:
Consider our core of premathematical knowledge.
By compounding the known concepts and methods,$$Using 
abstraction from reality, analogy with existing theories,
the postulational method, and problem-solving techniques $
we can extend this
foundation a little wherever we wish.
⊗4<Visualize: SEVERAL SHORT LINES IN BLACK, EMANATING
FROM A CENTRAL CORE>⊗*.
The incredible variety of alternatives to investigate includes
all known mathematics, much trivia, countless deadends, and so on.
The only "successful" paths near the core 
are the narrow ribbons of known mathematics 
(perhaps with a few undiscovered other slivers). 
⊗4<Visualize SNAKE-LIKE LINES IN RED, twisting away from the core, intersecting>⊗*.

.ONCE INDENT 4

How should we walk through this immense space, with any hope of following the
few, slender branches of already-established mathematics (or some equally successful
new fields)? We must do hill-climbing; as new concepts are formed, decide how 
promising they are, always explore the currently most-promising new concept. The
evaluation function is quite nontrivial, and this research may be viewed as an
attempt to study and explain and duplicate the judgmental criteria people employ.
My attempts at codifying such "mysterious" emotive forces as intuition, aesthetics,
utility, richness, interestingness, relevance... indicate that a large but not
unmanagable collection of heuristic rules should suffice.

.ONCE INDENT 4

The important visualization to make 
is that with proper evaluation criteria, we convert the
flat picture  to
a breath-taking relief map:
the known lines of development become
mountain ranges, soaring above the vast flat plains 
of trivia and inconsistency
below.
Occasionally an isolated
hill is discovered near the core;$$ E.g.,
Knuth's ↓_Surreal Numbers_↓ $ certainly
whole ranges lie undiscovered for long periods of time:$$ E.g.,
non-Euclidean geometries $ the terrrain far from the initial core is
not ⊗4at all⊗* explored yet.

.ONCE INDENT 4

Intuition is like vision, letting the explorer observe a distant mountain,
long before he has conquered its intricate, specific challenges.

.ONCE INDENT 4

If the criteria for evaluating interestingness and promise are good enough, then
it should be straightforward to simply push off in any direction, locate some
nearby peak, and follow the mountain range along (duplicating the development in
some field). In fact, by intentionally pushing off in apparently barren directions,
new ranges might be enountered. If the criteria are "correct", then ⊗4any⊗* new
discovery the system makes and likes will necessarily be interesting to humans.

.ONCE INDENT 4

If, as is more likely, the criteria are deficient, this too will tell us much. Before
beginning, we shall strive to include all the obvious factors which enter into
judgmental decisions, with appropriate weights, etc. If the criteria fail, then
we can analyze that failure and learn about one nonobvious factor in evaluating
success in mathematics (and in any creative endeavor). After modifying the criteria
to include this new factor, we can proceed again.

λλ Syntax and Semantics in Natural Language Processing:
Consider assumptions, axioms, definitions, and
theorems to be 
syntactic rules for the
language that we call Mathematics. 
Thus theorem-proving, and the whole of textbook mathematics, is a purely syntactic
process.
Then these judgmental
criteria I am alluding to would correspond to the semantic knowledge associated
with these more formal methods.
Just as one can upgrade natural-language-understanding by encorporating semantic
knowledge, so AM ensures that it knows the intuitive and empirical roots of
each concept in its repertoire.

.E

. SSSEC(AM as Heuristic Search)

.ONCE TURN ON "{}"

As the title of this section -- and this thesis -- proclaims, AM is a
kind  of  "heuristic search"  program.   That  must mean  that  AM is
exploring  a  particular  "space,"  using  some  informal  evaluation
criteria   to   guide   it.     Sections   {SECNUM}.{SSECNUM}.2   and
{SECNUM}.{SSECNUM}.3 will present this view in detail.  First, let me
define what a heuristic search is. Readers familiar with this concept
may skip the first subsection entirely.

. SSSEC(Heuristic Search)

Heuristic gourmets  sometimes find  it useful  to classify  heuristic
searches  based on  their taste,  but for  our modest  purposes we'll
stick to just one flavor$$ Rocky road. $.

The activity we  wish to model  using the search  must be phrased  in
terms  of  states  and  operators.  The   operators  are  capable  of
transforming  some  states into  different ones.  Given  some initial
configuration of states, the  "game" is to repeatedly apply  opertors
to states.  The objective will either  be a particular state,  or any
state satisfying  some well-defined criteria.  Finally, some rules of
thumb must  be  present, to  guide  the player  as he  chooses  which
operator to apply to which state next.

This is  a rather confused presentation;  a superior one  is found in
Chapters...  of  [Nilsson]. Perhaps a  more graphical  interpretation
will be  helpful.   A  graph is  a  bunch of  little circles,  called
⊗4nodes⊗*, which represent  the states of the search, plus a bunch of
arrows,  called  ⊗4arcs⊗*,  which   represent  applications  of   the
operators. 
Each node has a name, which is written inside its little circle, and
each arc may have a label, which is written alongside it.
If an arc points from node  X to Y and is labelled F, then
we  assume  that opeator  F  was applied  to  state X,  and  that the
resultant  state  was  Y.    

.B0

	    ⊂∂∂∂∂∂∂∂⊃	
	    ~	    ~
	    ~	X   ~
	    ~	~   ~
	    ~	~   ~
	    ~	~ F ~
	    ~	~   ~
	    ~	↓   ~
	    ~ 	Y   ~
	    ~	    ~
	    %∂∂∂∂∂∂∂$

.E


The  whole  idea  of  "search"  in  this
interpretation is to carefully enlarge the graph (to keep drawing new
arcs and nodes).


Each node can have several arcs pointing to it; each arc can point to
several different nodes; and each arc can point ⊗4from⊗* a collection
of  different nodes.   Thus an  arc is like  an arrow which  can have
multiple heads and tails:

< Diagram of a graph >

.B0

A			Z	   W
~\		       /	  /
~ \	B←∂∂∂∂∂∂∂∂∂∂∂∂∂∂∂C	 /			Q
~  \    ↓		 ↑	/		       /
~   R∂∂→D∂∂∂∂∂∂→E∂∂∂∂∂∂∂→H←∂∂∂∂∂∂∂∂∂∂∂L∂∂∂∂∂∂∂∂∂∂∂∂∂∂∂∂
~  /						       \
~ /							\
↓/							 G
U								

.E

Some more terminology may come in handy here.  If an arc  points from
X to  Y, we  say that  Y is  a ⊗4successor⊗* of  X, and  that X  is a
⊗4parent⊗*  of Y. If a chain of arrows leads  from X to Y... to Z, we
say that Z is a ⊗4descendant⊗* of X, and that X is an ⊗4ancestor⊗* of
Z.  If there is one function which can generate all successors of any
given node, we call  that the ⊗4successor  function⊗* of the  search.
Any function  which is  used to  informally rate  the "value" of  any
given node is called a ⊗4heuristic evaluation function⊗*.

Let's give a few examples.

<<Rewrite this chess example. >

In a chess-playing program, a natural choice is to make each possible
board position correspond  to a state. The  operators are simply  the
legal  moves of  the  game.  So there  would  be  an operator  called
"Move-rook",  which  could  transform  a  given  state  into a  large
collection of new ones, which differed from the original  one only in
that  one rook  had been  (legally) moved  along some  rank  or file.
Sometimes, this operator would yield ⊗4no⊗* successors (e.g., if  one
rook  is gone  and  the  other is  pinned  against  the king).    The
heuristic evaluationfunction would  be a function which looked at any
board situation, and counted material, examined control of  key areas
of the  board, and in other  ways somehow pieced together  an opinion
about just  how good this board really is (from the standpoint of one
particular  player).   Each  player  might have  his  own  evaluation
function, or  they might both  use the same  one, or they  might vary
which evaluation function they use  depending on recent moves,  their
opponent's   reputation,  random   chance,  etc.      As  with   many
non-chess-playing  humans,  there is  no  sense of  a  definite goal,
except just  prior to  mating.  Rather, the  typical approach  is  to
generate some of the successors to the current board situation (using
heuristic  guidance  to  only  generate  plausible  moves),  and then
evaluate each successor state using the evaluation function. The best
of  these states would  then be  expanded in turn  (i.e., all  of the
opponent's relies would be considered).  If the opponent had  ⊗4any⊗*
good reply to one of the moves, that move  would then have its rating
lowered  accordingly. This process --  which is called ⊗4minimaxing⊗*
-- would be continued as long as time permitted. When forced to halt,
the successor of the current state  which had the highest value would
be  chosen as the  actual successor (i.e., the  corresponding move to
alter the  board to  that state  would then  be made),  and it  would
become you oppenent's turn.


<< Give a couple more examples of heuristic search. >

In theorem-provers,  each operator is a rule  of inference.  Applying
an operator adds one more wff  to the list of known true  statements.
So in some  sense ⊗4all⊗* paths lead to  a "goal", the problem  is to
find a  short, direct path. A solution is "good"  if it consists of a
very short path leading to an  acceptable goal node.  The quality  of
the path is very important.


. SSSEC(AM's Similarities to Heuristic Search)

AM explores mathematics by selectively enlarging  itself: AM ⊗4is⊗* a
body of mathematical knowledge (concepts, plus the wisdom to use them
effectively).  AM is a heuristic search program, much like  the chess
and theorem-proving  programs just  desribed.  To  see this,  we must
explain  what the nodes of AM's search  space are, what the operators
or links are,  where the heuristic  information comes into play,  and
what the evaluation function is.

AM's space can be consisered to consist of a bunch of nodes which are
each a facet/concept pair (a primitive task on AM's agenda of  things
to do).   In  one sense, the  successor function  is "EVAL" --  i.e.,
given a task,  execute it.  In a more meaningful sense, the operators
are the heuristic rules that AM uses to suggest new tasks  and create
new concepts.   While a task is executed (a  node is being expanded),
several new  tasks may get suggested (other nodes will be pointed to:
the successor nodes of the current one).  Each arc  in the space will
thus be a reason for executing the node it points to.  When a task is
suggested (by some heuristic),  it is tagged with  a list of  reasons
explaining why it  might be worthwhile. The evaluation  function need
only  scan all the nodes  (all the facet/concept  pairs), and examine
each's list of reasons.   In fact, experiments show that the  precise
form of  the evaluation function  is unimportant$$ See  experiment 3,
page...,  Section... $.  AM has a  definite algorithm for rating ithe
nodes of its space, and yet certainly has no  specific goal criteria:
it can't quit just because a dynamite task has been proposed. AM goes
on forever$$ Technically, forever is about 100,000 list cells $.


This is probably unclear, so perhaps a picture will help:

<<Draw a graph: AM as heur search. >

Notice that... <<Things to notice about that graph. >



. SSSEC(AM's Differences from Heuristic Search)

There are several difficulties  and anomalies in forcing AM  into the
heuristic search paradigm.  For example, the main entities that AM is
concerned with are concepts,  and yet the  nodes of its search  space
are  merely  pointers to  particular  empty  facets.   In  a  typical
heuristic search, the nodes are intimately ties up with the intuitive
"states" of the situation.

Another anomaly is  that the operators which  AM uses to enlarge  and
explore  the space of  concepts are themselves  mathematical concepts
(e.g., some heuristic rules result in the creation of new ones). Thus
AM should be viewed as a mass  of knowledge which enlarges ⊗4itself⊗*
repeatedly.    As  far  as I  know,  all  previous  systems kept  the
information they "discovered" quite  separate fromthe knowledge  they
use to make discoveries$$ Of course this is typically because the two
kinds of knowledge are very different: For a chess-player, first kind
is "good board positions," and the second is "strategies for making a
good move." $.

Let me  reemphasize at this time  that AM has  no well-defined target
concepts or target relationships.  Rather, its "goal criteria" -- its
sole  aim  --  is  to  preserve  the  interestingness  level  of  the
activities  it performs, of  the top tasks  on the agenda.   We don't
care what AM does --  or misses -- so long  as it spends its time  on
plausible tasks.   There is no  fixed set of theorems  that AM should
discover (AM is thus not like a typical ⊗4problem-solver⊗*), no fixed
set  of traps  AM  should  avoid  (AM is  thus  very  different  from
⊗4game-players⊗* like chess programs).

For example,  there is no stigma  attached to the fact  that AM never
discovered real  numbers$$  There are  many  "nice" things  which  AM
didn't -- and can't -- do: e.g., devising ↓_geometric_↓ concepts from
its initial simple set-theoretic knowledge.  Paul Cohen has indicated
that this would be a supremly exciting development: the  invention of
geometry -- and  all the tools it  provides -- by a  system which has
neither  geometric  intuition  built  into  it,  nor  definitions  of
geometric concepts.  I do not think this possible; see the section on
the  limitations of AM,  page...$; it  was rather surprising  that AM
managed to  discover ⊗4natural⊗*  numbers!   Even if  it hadn't  done
that, it would have been just as  acceptable$$ Acceptable to whom? Is
there really a  domain-invariant criterion for judging the quality of
AM's actions? See the  discussion in Section...,  on Page... $ if  AM
had simply gone off and developed ideas in set theory.

. SSSEC(Summary: Agendas and Heuristic Search)

<< Why the agenda mechanism is well-suited  to heur search algorithms
>

Consider  Nilsson's  description  of  depth-first  searching, and  of
breadth-first searching. He has us  maintain a list of "open"  nodes.
Repeatedly, we pluck the top one  and expand it. In the process, some
new  nodes may be added to the Open  list. In the case of depth-first
searching, they are added at the top; for  breadth-first search, they
must be  added at the  bottom. For heuristic  search, or "best-first"
search, they are  evaluated in  some numeric way,  and then  "merged"
into the already-sorted list of Open nodes.

This process is  clearly analogous to  the agenda mechanism  AM uses.
It  is also the  same as  the one used  in KRL [reference],  and used
earlier in Dendral$$ The "Predictor" part of DENDRAL. See [MI4 ref.].
$  The main  difference  is that  in AM,  symbolic  reasons are  used
(albeit in trivial token-like ways) to decide whether -- and how much
-- to boost the priority of a task when it is suggested again.

Agendas are the canonical  kind of data structure for  a "best-first"
process.